
Processing Frequent Itemset Discovery Queries by Division and Set Containment Join Operators

contributor	Anwendersoftware (IPVR)

creator	Rantzau, Ralf
date	2003-06

description	SQL-based data mining algorithms are rarely used in practice today. Most performance experiments have shown that SQL-based approaches are inferior to main-memory algorithms. Nevertheless, database vendors try to integrate analysis functionality to some extent into their query execution and optimization components in order to narrow the gap between data and processing. Such database support is particularly important when data mining applications need to analyze very large datasets or when they need to access current data rather than a possibly outdated copy of it. We investigate SQL-based approaches to the problem of finding frequent itemsets in a transaction table, including an algorithm that we recently proposed, called Quiver, which employs universal and existential quantification. This approach uses a table schema for itemsets that is similar to the commonly used vertical layout for transactions: each item of an itemset is stored in a separate row. We argue that expressing the frequent itemset discovery problem using quantification offers interesting opportunities to process such queries using set containment join or set containment division operators, which are not yet available in commercial database systems. Initial performance experiments reveal that Quiver cannot be processed efficiently by commercial DBMSs. However, our experiments with query execution plans that use operators realizing set containment tests suggest that efficient processing of Quiver is possible.
format	application/pdf
	202152 Bytes

identifier	http://www.informatik.uni-stuttgart.de/cgi-bin/NCSTRL/NCSTRL_view.pl?id=INPROC-2003-02&engl=1

language	eng
publisher	Rensselaer Polytechnic Institute, Troy, New York 12180-3590, USA
relation	Report No. 03-05
source	In: Zaki, Mohammed (ed.); Aggarwal, Charu (ed.): Proceedings of the ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD), San Diego, California, USA, June 13, 2003, pp. 20-27
	ftp://ftp.informatik.uni-stuttgart.de/pub/library/ncstrl.ustuttgart_fi/INPROC-2003-02/INPROC-2003-02.pdf
subject	Database Management Systems (CR H.2.4)
	Database Applications (CR H.2.8)
	association rule discovery
	relational division
	set containment join
title	Processing Frequent Itemset Discovery Queries by Division and Set Containment Join Operators
type	Text
	Article in Proceedings
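
The description above mentions a vertical layout (one item per row) for both transactions and itemsets, and support counting phrased with universal and existential quantification. The following is a minimal sketch of that idea only, not the Quiver algorithm or the set containment operators from the paper. The table names `transactions` and `candidates`, the function `frequent_itemsets`, and the use of SQLite are all assumptions made for illustration. A candidate itemset is counted for a transaction when no item of the candidate is missing from that transaction, which is the classic double `NOT EXISTS` formulation of a set containment test.

```python
import sqlite3


def frequent_itemsets(transactions, candidates, min_support):
    """Count the support of candidate itemsets with a quantified SQL query.

    transactions: dict mapping a transaction id to a set of items
    candidates:   dict mapping an itemset id to a set of items
    Returns a dict {itemset id: support} for itemsets with support >= min_support.
    All names here are hypothetical; they are not taken from the paper.
    """
    conn = sqlite3.connect(":memory:")
    # Vertical layout: each item of a transaction or itemset is a separate row.
    conn.execute("CREATE TABLE transactions (tid INTEGER, item TEXT)")
    conn.execute("CREATE TABLE candidates (itemset INTEGER, item TEXT)")
    conn.executemany(
        "INSERT INTO transactions VALUES (?, ?)",
        [(tid, item) for tid, items in transactions.items() for item in items],
    )
    conn.executemany(
        "INSERT INTO candidates VALUES (?, ?)",
        [(cid, item) for cid, items in candidates.items() for item in items],
    )
    # "For all items of the candidate there exists a matching row in the
    # transaction", written as NOT EXISTS (... NOT EXISTS ...).
    query = """
        SELECT c.itemset, COUNT(*) AS support
        FROM (SELECT DISTINCT itemset FROM candidates) AS c,
             (SELECT DISTINCT tid FROM transactions) AS t
        WHERE NOT EXISTS (
            SELECT 1 FROM candidates AS c2
            WHERE c2.itemset = c.itemset
              AND NOT EXISTS (
                  SELECT 1 FROM transactions AS t2
                  WHERE t2.tid = t.tid AND t2.item = c2.item))
        GROUP BY c.itemset
        HAVING COUNT(*) >= ?
        ORDER BY c.itemset
    """
    rows = conn.execute(query, (min_support,)).fetchall()
    conn.close()
    return dict(rows)


if __name__ == "__main__":
    example_transactions = {1: {"a", "b", "c"}, 2: {"a", "b"}, 3: {"a", "c"}, 4: {"b"}}
    example_candidates = {1: {"a", "b"}, 2: {"a", "c"}, 3: {"b", "c"}}
    print(frequent_itemsets(example_transactions, example_candidates, min_support=2))
```

A general-purpose engine evaluates the nested subqueries once per pair of candidate and transaction, which hints at why the description reports that such queries are not processed efficiently without dedicated set containment operators.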